Learning with fuzzy labels : a random set approach towards intelligent data mining systems
نویسنده
چکیده
Traditionally machine learning and data mining research has focused on learning algorithms with high classification or prediction accuracy. This is not always sufficient for some applications that require good algorithm transparency. In this thesis, we apply a random set based high-level knowledge representation language to some well-known data mining algorithms in order to build transparent models. This new framework is referred to as Label Semantics, which captures the idea of computation on linguistic expressions rather than numerical quantities. Based on this framework, we proposed linguistic decision tree, fuzzy Bayesian estimation tree and linguistic FOIL algorithms. Empirical experiments on real-world data sets suggest that our algorithms have better or equivalent classification (or prediction) accuracy and much better transparency compared to other well known machine learning algorithms such as C4.5, naive Bayes, neural networks and support vector machines.
منابع مشابه
Collaborative rule generation: An ensemble learning approach
Due to the vast and rapid increase in data, data mining has become an increasingly important tool for the purpose of knowledge discovery in order to prevent the presence of rich data but poor knowledge. Data mining tasks can be undertaken in two ways, namely, manual walkthrough of data and use of machine learning approaches. Due to the presence of big data, machine learning has thus become a po...
متن کاملOptimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining
The Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page is one type of web data, was regarded as a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...
متن کاملKnowledge Discovery in a Framework for Modelling with Words
The learning of transparent models is an important and neglected area of data mining. The data mining community has tended to focus on algorithm accuracy with little emphasis on the knowledge representation framework. However, the transparency of a model will help practitioners greatly in understanding the trends and idea hidden behind the system. In this chapter, a random set based knowledge r...
متن کاملDistributed Data mining In Wireless Sensor Network Using Fuzzy Naïve Byes
The wireless sensor nodes are getting smaller, but Wireless Sensor Networks (WSNs) are getting larger with the technological developments, currently containing thousands of nodes and possibly millions of nodes in the future. Therefore, effective and trustworthy event detection methods for the WSN require robust and intelligent methods of mining hidden patterns in the sensor data, while supporti...
متن کاملMINING FUZZY TEMPORAL ITEMSETS WITHIN VARIOUS TIME INTERVALS IN QUANTITATIVE DATASETS
This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high textit{support} or occurrence within particular time intervals, they do not necessarily get similar textit{support} across the whole dataset, which makes i...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005